WO 2005/080588

[Figure 1: multiple amino acid sequence alignment. Rows, in order: SEQ 3, SEQ 6, SEQ 8, SEQ 10, SEQ 12, SEQ 14, SEQ 16, SEQ 19, SEQ 22, SEQ 24, SEQ 27, SEQ 30, SEQ 33, SEQ 35, SEQ 38, SEQ 40, SEQ 42, SEQ 44, SEQ 83, SEQ 85; Bacteria: T44612, NP_625402, NP_295913, AF320254; OYE family: Af4875, Af4961, Ca2460, Nc4452, ScOYE1, ScOYE2, ScOYE3, A36990.]

Figure 1. A multiple alignment of the 2031 OR amino acid sequence
from A. fumigatus (SEQ ID No. 3) along with related 2031 ORs from
other fungi and bacteria (see Example 4) and OYEs. Regions 1-11,
marked with * or #, refer to amino acids conserved between ORs
but not OYEs.

Fungal 2031 ORs are given by the following SEQ ID Nos.: A.
fumigatus, SEQ ID Nos. 3, 6 and 8; A. nidulans, SEQ ID No. 10; C.
albicans, SEQ ID Nos. 12 and 14; N. crassa, SEQ ID Nos. 16 and 19;
M. grisea, SEQ ID Nos. 22 and 44; S. pombe, SEQ ID No. 24
(NP_595868); C. trifolii, SEQ ID No. 27; F. sporotrichioides, SEQ
ID Nos. 30, 33 and 35; F. graminearum, SEQ ID Nos. 38 and 83; M.
graminicola, SEQ ID Nos. 40 and 42; U. maydis, SEQ ID No. 85.

Bacterial ORs resembling 2031 are:

T44612 (Pseudomonas putida), SEQ ID No. 86; NP_625402
(Streptomyces coelicolor), SEQ ID No. 87; NP_295913 (Deinococcus
radiodurans), SEQ ID No. 88; AF320254 (Azoarcus evansii), SEQ ID
No. 89.

Fungal ORs similar to the Old Yellow Enzyme family (originally
identified in S. cerevisiae):

A. fumigatus, Af4875 and Af4961, SEQ ID Nos. 90 and 91
respectively; C. albicans, Ca2460 and A36990, SEQ ID Nos. 92 and
93 respectively; N. crassa, Nc4452, SEQ ID No. 94; S. cerevisiae,
OYE1, OYE2 and OYE3, SEQ ID Nos. 95-97 respectively.

Details of the sequence searches that identified the ORs other 
than SEQ ID No. 3, and methods for the construction of multiple 
alignments are given in Example 4 hereinafter. 
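
The idea behind the marked regions of Figure 1 (positions conserved
among the ORs but not shared with the OYEs) can be illustrated with
a short sketch. The function name, the strict conservation rule
(every OR row identical, no OYE row carrying that residue) and the
toy rows are assumptions made for illustration; they are not the
method of Example 4 and are not taken from the figure.

```python
from typing import Dict, List


def group_specific_columns(
    ors: Dict[str, str], oyes: Dict[str, str], gap: str = "-"
) -> List[int]:
    """Return 1-based alignment columns where every OR row carries the
    same non-gap residue and no OYE row carries that residue.

    This strict rule is an assumption chosen for illustration only.
    """
    rows = list(ors.values()) + list(oyes.values())
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all aligned rows must have the same length")

    hits = []
    for i in range(length):
        or_residues = {seq[i] for seq in ors.values()}
        if len(or_residues) != 1:
            continue  # column is not conserved across the OR rows
        residue = next(iter(or_residues))
        if residue == gap:
            continue  # a shared gap is not a conserved residue
        if all(seq[i] != residue for seq in oyes.values()):
            hits.append(i + 1)
    return hits


# Toy rows, invented for illustration only (not taken from Figure 1).
toy_ors = {"or_a": "MCQYSA-G", "or_b": "MCQYSADG", "or_c": "LCQYSA-G"}
toy_oyes = {"oye_a": "MTRMRA-G", "oye_b": "LTRFRSDG"}

print(group_specific_columns(toy_ors, toy_oyes))  # [2, 3, 4, 5]
```

A looser rule (for example, tolerating conservative substitutions
within the OR group) would mark more columns. The strict version is
used here only because it is the simplest to verify by eye.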
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1 il 21 31 41 51 61 71 81 91 

SEQ 1 GiiccIcGic liiGC^Icai iic^CCC^' G^^ACGC CMGi^GCCG AGC^TCGCC ^CGATATGCC ^CGAATTTGC fCCATTCGGC ATCCAGTTTC 

SEQ 84 ' 

101 111 121 131 141 151 161 III III 



SEQ 1 CAGTGCCCTT 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ IB 
SEQ 20 
SEQ 21 



ii^^cw c^™^ =EEE ^r.^'^l '^l'^ ^^^tl ^!!f!!!!^ ™f SmS SctS 



SEQ 23 • ZZI-IIIIII IIIIIII . . CGAAA CCTCGACCCA AACAAACAGC 

SEQ 26 Z"ZrZIZ ZZZZZZZZZZ ZZZZZZZIIZ ZZZIZZZZZI ZZZZZ-ZZZZ ■ GAAc 

ii I ZZZZj;^^ 'ttI^TgT^ l^TGiZill ^GGCGicG Tcil^TTT AT^^ACCT ATACTTGTTT GTTCACTTCT ATGCTACTCA TATCAATCCG 

SEQ 84 



211 



221 231 241 251 



iiii is = =c = 

. ~ " . A TGTCGCAACC 

■ ZZZZZ ZZZZZZZZZZ ZZ— ZZZ A TGGGTTCCAA 

- ~ nmR&rar: TTrrATArCA 



. ;c;;;^ci^ -.^i ---- 

SEQ 2 
SEQ 4 
SEQ S 
SEQ 7 

SEQ 9 ZZZZZZZZZZ ZZZZZZZZZZ ZZ ATGACAG TTCCATACCA 

SEQ 13 ZZZZZZZ ZZZZZZZZZZ ZZZZZZZZZZ ZZZ ^A TGGccGACTT 

25 'i^ciii'cVc 0;;=;^ ;;;;;cc^^ iccic^ci ci;;Gcc;c i^ccc^s: iiz':':'^^ 

SEQ Vc^liZii icG^TAiiA ga^cgS^tI cicTATAT^ E^^!*! ^If?!!!!!^ 

34 I^II^Mc^ I^GicccCT lil^^i ^TCTMtli^ 3°****^!!^ ^t^!^!!!! I^E^!! !!!^!!!- ^^T. 

11 11 ^^^^^ i^cZi'c^ ^^i;;^icc ^iiii 

SEQ 84 ■ 



wo 2005/080588 



PCT/GB2005/000623 



6/17 



^ ********** ********** ********** ♦* — . 

SEQ 1 GACTGTCGCC GATATCGACG TTCCTCCTGC CGAGGGCATC CCCTACTTCA CTCCGGCCCA GAACCCTCCT GCCGGTACGG CAGCTAACCC CCAGACCAAT 

SEQ 2 GACTGTCGCC GATATCGACG TTCCTCCTGC CGAGGGCATC CCCTACTTCA CTCCGGCCCA GAACCCTCCT GCCGGTACGG CAGCTAACCC CCAGACCAAT 

SEQ 4 TGTTGTGCCT GACATCGAGA ACAAACCCGC GCCGGGTATC TCGTACTTTA CTCCGGCGCA AGAGCCGCCT GCTGGCACCG CTGCTAATCC TCAGTCTGAT 

SEQ 5 TGTTGTGCCT GACATCGAGA ACAAACCCGC GCCGGGTATC TCGTACTTTA CTCCGGCGCA AGAiGCCGCCT GCTGGCACCG CTGCTAATCC TCAGTCTGAT 

SEQ 7 CGCCTTCCGG TCCCCCGCCG TCACCAAGTC CTCCTCCACC CCCTACTACA CTCCCGCCAA CAATGGAGGC GCCGCCCTGC ACCCCGACGA CCCCAC- 

SEQ 9 GGCTCTCCCT GACGTCGAAA ACACCCCCGC CGCCGGCATC CCCTACTTTA CACCAGCACA GAACCCTCCT GCTGGAACAG CTGCCAACCC GCAAACCAGC 

SEQ li AGTAAAACCA TCAGATGAAA TCAAAGGTGC TCCTGAGGTT TCCTATTACA CTCCAGAACA GCCTGTTCCG GCTGGTACTT TTTATCCCCA ATCGTC A 

SSQ 1-3 w-—^^-*— — 1-*^**^^^ — p*^— ^ wM«-»Mite— — kita^^ — w*^-^^— ^•"^AXGGAAA ACAACAATAC TATaCCG"""" 

SEO IS CACCCAGAAG AAGACCTCCT CCCCCGCGGC CCCGGGTGTT CCCTTCTACA CCCCGGCCCA GGTCCCCGCC GCCGGCACTC CCCTCCCCTC CACCCCC 

cpn IT ^ ATGGCTACTT CCACTACCTC C6ACCTC 

SEQ 18 nil-I ATGGCTACTT CCACTACCTC CGACCTC 

SEQ 20 GGCAGAAAAG AAGACTTTGA GCAAACGGGC CGCCGGGGTG CCTTACTACA CCCCAGCCCA GGAGCCGCCG GCAGGGACCC CTTTGCAGCA GCAGGACG— 

SSQ 21 GGCAGAAAAG AAGACTTTGA GCAAACGGGC CGCCGGGGTG CCTTACTACA CCCCAGCCCA GGAGCCGCCG GCAGGGACCC CTTTGCAGCA GCAGGACG— 

SEQ 23 ATGAC TATTGTTAAT GAAGGAGCCG AAAATGTTGG TTATTTTACA CCTGCGCAAA AAATACCAGC TGGAGCGGCG ATAGGTGTAC CGCAAA 

SEQ 25 CAGCATGACG GGCACCGCGA ACAAGGCCGC CCCCCGTGTG CCGTTTTACA CCCCGGCCCA GGAGCCTCCC GCGGGAACGC CAGTCGACGC CAGCACGG— 

SEQ 2S ATGACG GGCACCGCGA ACAAGGCCGC CCCCGGTGTG CCGTTTTACA CCCCGGCCCA GGAGCCTCCC GCGGGAACGC CAGTCGACGC CAGCACGG— 

SEQ 28 GGCTTACGAG ATAATCGACA ACGTTGCGGC TGAAGGGGTT CCATATTACA CACCGGCTCA AGACCCGCCA GCTGGTACGC AGACAAGCGG CTCAACG 

SEQ 29 GGCTTACGAG ATAATCGACA ACGTTGCGGC TGAAGGGGTT CCATATTACA CACCGGCTCA AGACCCGCCA GCTGGTACGC AGACAAGCGG CTCAACG-— 

SEQ 34 CCATCACAAA ATCATCATCA ATAAGGAAGC TCCGAATGTT CCTTTCTATA CTCCAGTGCA AGATCCACCA GCAGGAACGT CTTACGATGT TCAGCCTGAA 

SEQ 3 6 -GCACGAGGG ATTATTGACA ACATCGCGGC TGAAGGGGCT CCCTACTACA CGCCTGCTCA AGACYCTCCA GCAGGCACAC AGACCAGCGG CTCAACCA— 

SEQ 37 -GCACGAGGG ATTATTGACA ACATCGCGGC TGAAGGGGCT CCCTACTACA CGCCTGCTCA AGACYCTCCA GCAGGCACAC AGACCAGCGG C"^^^^^^" 

SEQ 39 ' 2 "IIIIIII ZIIIIIIZII -IIIIIII-Z ""IIIIII 

SEQ 4 3 '"IIIIIIII IIIIIIIIII IIIIIIIIII III ATGT CCCCACCACG CTTCGAAGCG GCCCCTGCCG ACCCCTCACC GCTCGGC 

SEQ 82 AAACAAGGAG GTTGTTCAGA ATGTCGCTGC CAAAGGAGTG CAATACTTCA ACCCTGAGCA ACTTCCTGCA CCAGGTCTCG GTATAA^GG TCCCAAT--- 

SEQ 84 ACCGCCTCTC GTCGACTCGA TCGATGCACT CAAGATCAGC AACTTTGTCC CCACTCGAAG TGGCCACCCT CCTCCTGGCT CGGTCCCGGA ATCCATCCTG 

401 4L1 421 431 441 4S1 461 471 481^ 491 

SEQ 1 GG CC AGAAGATCCC CAAGCTCTTC ACGCCCTTGA CCATCCGTGG CGTCACC TTCCAGAAC CGCCTTGGTG 

SEQ 2 GG CC AGAAGATCCC CAAGCTCTTC ACGCCCTTGA CCATCCGTGG CGTCACC "Z^^ll^nl'r^ 

SEQ 4 GG AT CGGCACCTCC CAAGCTCTTC CGGCCGCTTT CGGTGCGGCG TCTGACC TTTCACAAT CGCATTGGCG 

SEQ 5 GG AT CGGCACCTCC CAAGCTCTTC CGGCCGCTTT CGGTGCGGGG TCTGACC TTTCACAAT CGCATTGGC- 

SEQ 7 -GACCCC TACGCTCTTC CGGCCCTTAC AAATCCGCAA TGTGACG ■ CTCAAGAAC CGCATCATG- 

SEQ 9 GG CA ATGCCGTCCC CAAGCTGTAC ACACCTCTGA CGGTGCGTGG GGTGACC tn^JJnr^" 

SEO 11 GA TG AAGTTGCTCC CAAAATTTTT CAACCTTTAA AGATTGGTAA GCTTGCT-^ TTGCCAAAC AGAATTGGG- 

SEQ 13 GCATTATTT CAACCCATAA AGATCAGTGA CTCGATC AC ATTACCTAAT AGAATTGGT- 

SEQ 15 G GCGATGTCCC TACTCTCTTC ACCCCTCTCA AGATCCGTGG TGTTGAG CTCCAGAAC CGCTTCGCC- 

SEO n . -AAACTCTCC CAACCCCTCA CCCTCCCCAA TGGCCTT— AC CCTCCCCAAC CGCCTCGTC- 

SEQ la AAACTCTCC CAACCCCTCA CCCTCCCCAA TGGCCTT AC CCTCCCCAAC CGCCTCGTC- 

SEQ 20 CCATCCC AACGCTGTTC AAGCCTCTGA AGATCCGTGG CGTCGAG CTCTCCAAC CGCTTTGGC- 

SEQ 21 CCATCCC AACGCTGTTC AAGCCTCTGA AGATCCGTGG CGTCGAG ■ -CTCTCCAAC CGCTTTGGC- 

STCO 23 C AAAATTATTT ACTCCTCTTA AAATTAGAGG AGTGGAG ' TTCCATAAC AGAATGTTT- 

SEQ 25 CTCC GACGCTCTTC AAGCCCCTCC GCATCCGCGA CCTCACC ATCAACAAC CGCATCTGG- 

SEO 26 CTCC GACGCTCTTC AAGCCCCTCC GCATCCGCGA CCTCACC ATCAACAAC CGCATCTGG- 

3EO 28 AAGCTATTC ACACCCATCA CCATCCGCGG CGTCACA TTGCCAAAC CGCCTCTTC- 

SEQ 29 AAGCTATTC ACACCCATCA CCATCCGCGG CGTCACA TTGCCAAAC CGCCTCTTC- 

SEQ 34 GG------- AAGCCTATTC TCTCTTATTA AAATAAGAAA CCTGACT CTTCAAAAC CGGATTTTT- 

SEQ 3 6 AGGTTTTC ACACBCATCA CCATCCGAGG CGTCACA TTGCCAAAC CGTCTCTTT- 

SEO 37 AGGTTTTC ACACBCATCA CCATCCGAGG CGTCACA TTGCCAAAC CGTCTCTTT- 

SEQ 39 CCTCA AGATCCGAGG TCTTACC CTCCAGAAC CGTATTATG- 

SEQ 43 IIIIIIIIII IIIIII^GC CGCTCAAATA CCCCGTCTCG GGGCGGTCG GCGCCCAAC CGGTTCCTC- 

SEQ 82 A ATACTCTACC AAAGGTCTTT ACACCCATCA AGATTCGCCC CATGACC ATGCCCAAC CGTATCTGG- 

SEQ 84 CCAGAGGGTG TCAAAAAACC GGCTTTGTTC CAAACGTTGA CATTGCCCTT TGCTGCftCCG GAACAGGCGG GTAAGATGAC CTTCAAGAAC CGCATCATT- 

501 511 521 531 541 551 561 571 581 591 

SEQ 1 ilAGTCCGTT T6CCCTTGCT GATATCGACG AAAGCTAATC CCCCGTCAG- --------- i:::"":: Hi::::;:: IHHScGC GCCcSctGC 

SEO 4 TGAGTGCAGT CCAGGCAATT ATGCTATCCA TCCTATGCGA GCCCTTGCAT TGGAACAGCC GCTTACAGGG AATGATAATG AGTAGCTATC GCCACTCTGC 

^ ^ „ CTATC GCCACTCTGC 



SEQ S 
SEQ 7 



" « - — — GTGTC GCCCATGTGC 

31 - - — _ — ^-CTCGG GCCGCTCTGC 

?S 7, ~~II Z"II ZIIIIZIZZI IZ_Z - * — . - - — - — — . GTATC TCCAATGTGT 

""~ IZII IIIIIIIIII II-I I - - — ' — • GTTTC ACCAATGTGC 

Z ZZZZZZZ ZZZZ 1 - - — ' GTTGC GCCCATGTGC 

"I ZZZII IIIIIIIIII IIIII-I - - - - — — — ~ — — AAAGC CGCCATGGCC 

- "ZZZZZZ Z_ZZ . • AAA.GC CGCCATGGCC 

-- ~ 2" ZZZZZZZ IIIII-III- - — - - — — GTCTC GCCCATGTGC 

"I ZZIII IIIIIIIIII IIII „ . - - ' — GTCTC GCCCATGTGC 

— 2 " ZIIIIIII-I - — - • — — GTTTC GCCCATGTGC 

- "ZZ I GTCAG CCCCATGTGC 

ZZZI^II - ■ ■ ' GTCAG CCCCATGTGC 

- ^ ^-CTTGC CCCTCTCTGC 

- ^ - CTTGC CCCTCTCTGC 



SEQ 11 
SEQ 13 

SEQ 15 
SEQ 17 
SEQ 19 
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 2 6 
SEQ 28 
SEQ 29 
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601 611 621 631 _ 641 651 !!L„__, !!- — 

rllilrTrCG cc"II-~ ZIIIIIII" -"CAGgIcG GCCA^MgI^ CGAcI""- ilckcATCG CCCATCTGGG TGGGATCGCC CAACGCGGAC 

cln I ^llrlrrt cri- CAGGACG GCCACATGAC CGAC TACCACATCG CCCATCTGGG TGGGATCGCC CAACGCGGAC 

SEQ 2 CAATACTCCG CC -iftCGATG GACACATGAC TCCC TGGCATATGG CACATCTTGG AGGGATTGCC CAGCGAGGGC 

Ho 5 S^SScAG CC:::::::: ^cS^G Si^iS TCCC TGGCATATGG CACATCTTGG AGGGATTGCC CAGCGAGGGC 

Hi ? S^S? ^^GAGXCGGA CCCG.CGTCT CCCC^CTCG GCCCCCT^C AAAC------ XACCACCTGG CGCATCT^^^^ CCACCTCG^^ CTC^GGCG 

SEQ 9 CAGTACTCCG CA "^J^^^J? S^SS^Sc TAcStTTAA TCCATTATGG TTCATTAGTG AATCGTGGGC 

SEQ 11 CAA^ATTC-IG CT ^^^^^ SS^SaC TCTG TTTCATTTTG TTCATTATGG ATCATTTGCT GTACGTGGAC 

SEQ 15 AcSaScTG CcI::::::: -^CGA?^ S^Sc TGGCACCTTG TCCACCTGGG CTCCXtCGCC CTCCGCGGTG 

SEQ 15 ACCTACTCIt- >-«T"rrGGCa accaCCTGCC CAAC CCCGAACTCG CCGCCGTCTA CGCCACCTGG GCCCGCGGCG 

SEQ 11 <2^CAAATGG GC Sc?GCC CCcSaCTCG CCGCCGTCTA CGCCACCTGG GCCCGCGGCG 

Ho 20 a^t^^Sg ccI::::::: :::::::: gIcgSg gccacSgS ttccacttgg tgcacctggg ccagtjcgcc ctgcacggca 

SEQ 20 ACCTACTCAOi CC __rarraTr rrrT^rCTGAC CGAC TTCCACTTGG TGCACCTGGG CCAGTTCGCC CTGCACGGCA 

II Sat?ccc S::::::;: i^cT^"^. r™fc a^-— .«:a,cttgo Sc^g 

SEQ 23 ACTTATTCCb Ci ___nArAATr, RCCACGCGAC CGAC TACCACCTCG TCCACCTGGG CCAGTTCGCC CTGCACGGCG 

SEQ 25 CAGTACTCCG CC gCCACGCGAC CGAC TACCACCTCG TCCACCTGGG CCAGTTCGCC CTGCACGGCG 

SEQ 26 CAGTACTCCG CC ^^^^ cTTAicCCAC T^AT 1GGCACTTGA CTCACCTCGG GGGAATAATC CAAAGAGGCC 

Sq 29 C^-5^?CCG CC:::=::= ::::::: ^S^G StSgcSc ^GGCACTTGA CTCACCTCGG GGGAA^AATC CAAAGAGGCC 

II ZZZZZZZTZn nl I IIIIIZIIII I-^AAGGATG GTGTCATGAC CCCC TGGCACAAAC AACACCTGGG CAGCTfCGCA GCACGAGGTC 

SEQ 34 CAATATTCAG CA- t??^ATG GATATGCTAC TGAT TGGCACTTGA CTCATCTCGG AGGCATTATC CAACGAGGCC 

SEQ 3 6 CAATACTCCG CC - ^^^H ™tgcTAC TGAT TGGCACTTGA CTCATCTCGG AGGCATTATC CAACGAGGCC 

II c^Sg S:::::::: :::::::::: :-"™cg Scac^Sac Z^l—^^ ^ggS^cIca cccacatggg cggcatcatc caacgcggtc 

SEO 43 ;;;GGC^;GG Cp^Ccii i^G^GGCG GAC^ AGCGC^^T ||acGGAG '^^^C. ^GCGAGT 

IIq II SSSg cgI::::::- :::::::::: --Sg g™?^ ^cS — ?accacattg cgcatttggg atcgtttgcc ctgcacggtg 

701 711 721 731 741 Ht III !!- 

^_»__««*** *♦♦♦***♦♦♦ — — — — — — 

qpn 1 rrcG^cicAT gctgattgag gcgaccgccg tccagcccga a — ggccgc atcacccctc aggatgtcgg tctgtggaag gactcc CA 

Ho 2 ccggcSgat gc?gattgag gcgaccgccg tccagcccga a— ggccgc atcacccctc aggatgtcgg tctgtggaag gactcc- CA 

Ho 4 caggaSct? gSgS^cgag gcaacagcag tcgaaccgga a— ggcagg atcaccccgc aggacctggg actatggaaa gactcg --CA 

SEQ 5 SgGA^TcS GA^GGTCGAG GCAACAGCAG TCGAACCGGA A— GGCAGG ATCACCCCGC AGGACCTGGG ACTATGGAAA ^ACTCG---- -Z'-^--^^ 

Ho 1 Sggcctcgt cttcatcgaa gccaccgccg tgcagcccaa c— gggcgc atctccccca acgactcggg cctctggcag gacggcacca cctcggaaca 

SEO 9 ccggtctcat gatgatcgag gcaacctccg tctcacctga a— ggcaga atcacgccgc aggacgtcgg tttatggaag gactcg 

h Sib l^^lliT. ==" 

i i S ^^^^t?.^ Sc^c^-^ ==-c ™i 

i i iii is ™c -.t^= giii g^p ~£ 

Ho li cSccc-TGAC cattgtcgag gccacatccg tcacgcccaa c— ggacgc atctcgcccg aggacagcgg cctgtggcaa GACAGC CA 

IS II cTCGrcT?GT Stggtagaa GCGACAGCGG tttccccaga G— GGACGA atttcaccta atgattcagg attatggatg gagtcg — CA 

II CCGCcSgTC CATGGTCGAG GCGACCGCCG TCGAGGCTCG T— GGCCGC ATCTCCCCCG AGGATGTCGG TTTGTGGCAG GACTCG CA 

SEO 26 CCGCCC?g?C SJgGTCGA^ GCCACCGCCG TCGAGGCTCG T--GGCCGC ATCTCCCCCG AGGATGTCGG TTTGTGGCAG GACTCG CA 

SEO It ccggaSgtc CATGGTGGAG GCTACCGCTG tacaaaacca c— ggtcgc atcacacctc aggatgttgg tctgtgggaa gacggc— -CA 

Ho 29 CCGGA??g?C CA?GG?gSg GCTACCGCTG TACAAAACCA ^---GGTCGC ATCACACCTC AGGATGTTGG TCTGTGGGAA ^ACGGC---- 

lln II rrGG^cicAT TCTCACAGAA GTCAACGCAG TTTCACCAGA G GGACGA ATCAGTCCTG AGGATGCAGG CATCTACGAT GATGGG- CA 

SEO 36 CGGgISgTC cSggSSg GCCACCGCTG TTCAAAACCA C~GGTCGC ATCACCCCTC AGGACGTTGG TCTCTGGGAA GATGGA- CA 

Ieo 37 CGGGAC?G?C CMGGTA6AG GCCACCGCTG TTCAAAACCA C— GGTCGC ATCACGCCTC AGGACGTTGG TCTCTGGGAA GATGGA CA 

Ho 39 CCGGAC?cIc CTGcStG^V GCCACAGCCG TGACTCCTCA A---GGTCGC ATCACGCCTG AAGACGTCGG TATCTGGCAA GATTCT- 

fSEn 41 w-. — — — — — — — — — — — — — 71 ^ r-?f-'i->mv/Tr' 



llo 43 GGGGCCAGAT CCACACGGGC AACGTCATGA TCGACCCGGR GCACCTOGAG GCCCCGGGCA ACATGGTGGT GCCGCGCGAC GCCGAGCCCT CGGGCGAGCG 

82 CTKcS^? Stcctmaa gctaccgcag ttcaagcacg t— ggccgi atcacacctg aagattcigg catctggcta GACTCT CA 

It TC^SSS? StGGTCGAA GCAICTGGTG TTGABCCAGR G— GGGAGG ATCACGCCTC AGGACCTGGG TATTTGGTCG GAACRG CA 

801 811 821 S31 ^ 8S1 861 871 881 891 ^ 

SEQ 1 iiicGccccG '.'-'-lilcGcl llliZTcal 'cTiclTicZ ZHZgZ- ^c^okTcl gcgtg""- ----S^E!! =SSS??SE55 

li \ =fcS :::==c =S ^^T^c S^^S^c^ -^J ||^J 

=S SSc^c S^^: g^=?^^ ::"gc S g g^ 
1 Siife -"^^^ ™- fcS??::::: S 

f \\ t?SSc^ ??ACGgX SaTTGTTGA TTTTATTCAT GATCAAGAC- GGAATTTGCT GTATA CAATTG AATCACGCTG ^GCGAAAGAT 

S S diS S= ^ssss^ a^^^ c^^?5^;^o; cG^S ^Sg = 

i i iSi 3iS Sii Siii S=: = lEE Eill iill 

ii I Eiii sii iiiii i=i ^ii ~s si 

V. =cc=5 c::?= gS-oa S?-- S^^ff??^ ??5E::: 

IS « OTCG^MG wiic^C icGCCGKGC CGCC^gIg ^ACGG^IgC- cic-MCGTC GCG CAfiCTC GGACACCCCG GTCGCCAGOT 

82 TC?^S^G^ --CTGCGAA AGCACGTCGA GTTTGCCCAT GCCAACAM- TCTCTTATCG GTATC CMMT SGCCATGCTG OTCGCAAGK 

Sq 84 TCGGGATGCA CACAAGG CGCTGGTGTC GGTGCTCAAG TCCTTCACG- GATGGTCTGG GTGTA- GGGCTG CAACTGGCGC ATGCGGGAAG 
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♦+**+**+ 



SEQ 1 CACCACCGTT GCGCCCTGGA TCTCA-^ TTCTCGGCC ATCGCGACGG AGAAGGTCGG CGGATGGCCG

SEQ 2 S^S^CG?? GCGCC^GGi TCtS - - TTCTCGGCC ATCGCGACGG AGAAGGTCGG CGGATGGCCG

SEQ 4 CAGCACCGTC GCGCCATGGC TCTCG GCCAACGAT ACCGCCTCCG AGAAGATGGG CGGCTGGCCA

SEQ 5 CAGCACCGTC GCGCCATGGC TCTCG -GCCAACGAT ACCGCCTCCG AGAAGATGGG CGGCTGGCCA

SEQ 7 G^TGCCGTT GCGCCGTGGC TGGCG GCGC AGGCGGGCAA GTCGAGTCTG AAGGCGGATG AGAGCGTTGG CGGGTGGCCC

SEQ 9 TTCGflACATC GCCCCCTGGC TCATG AA CAAGGGCATC GTCGCGACGG AGAAGGTCGG TGGCTGGCCG

SEQ 11 TTCTGGTCAG CCCTTATTTT TGCAC - - TTGGAACAA GTTGCAGATA AATCTGTCAA TGGGTTTGCC 

SEQ 13 TGTTGAAGGG GTACCATTCC AACAA ATACAACA TGGTTGGUAA 

SEQ 15 C?CCACCAAG GCCCCCTGGC ACTAC - -CAGCGCGG CAAGAGCGAG CTTGCCGGCC CCGAGCAGGG TGGCTGGCCC 

SEQ 17 TCCGATGGGC GCGGGCACGC GGGGA -CTGT GGGAGAAGGC GGTGGCGCCC TCGCCGGTGC CGTTGGTGTT GGGAGAGGCG 

SEQ 18 TCCGATGGGC GCGGGCACGC GGGGA- CTGT GGGAGAAGGC GGTGGCGCCC TCGCCGGTGC CGTTGGTGTT GGGAGAGGCG 

SEQ 20 CAGCACAAAG GCCCCCTGGC ACGACTCCTT CACCCCCAGC GGCGAGTATA AGCCGAGAGA GGGCTTACW3 GTCGTCGGAC CCGAGTATGG CGGCTGGCCT 

SEQ 21 CAGCACAAAG GCCCCCTGGC ACGACTCCTT CACCCCCAGC GGCGAGTATA AGCCGAGAGA GGGCTTACAG GTCGTCGGAC CCGAGTATGG CGGCTGGCCT 

SEQ 23 TAnCACCACT GCTCCTTATC GAGGA — " TACACA GTTGCGACTG AflGCTCAAGG TGGGTGGGAG

SEQ 25 ™cACCc5g gScCGTGGA TcIcC GAGGCTCG CGGCAAGGCG CTGGCTCAGG AGAGCGAGAA CGGCTGGCCC

SEQ 26 ?aGCACcSg GCACCGTGGA TCACC - GAGGCTCG CGGCAAGGCG CTGGCTCAGG AGAGCGAGAA CGGCTGGCCC

SEQ 28 CaStgcSa ?cTCCCTGGC TAAGC GTAAATGCT GTCGCGGCGG AAGAAGTGGG TGGCTGGCCA

SEQ 29 CAG??GcSS ^SccSgGC l^l — — "GTAAATGCT GTCGCGGCGG AAGAAGTGGG TGGCTGGCCA

SEQ 32 — — ■ — — — — — — 

SEQ 34 rarrArACTC GTACCGTGGC TGGAC CGCAAGAAC ACTGCTTTTA

SEQ 36 ??g™g?a ?cTCCGTGGT TGAGC ATCAACGCT GTTGCCGCTA AGGAAGTCGG TGGCTGGCCA

SEQ 37 ?agSSg?A ?r?cCGTGGT TGAGC ATCAACGCT GTTGCCGCTA AGGAAGTCGG TGGCTGGCCA

SEQ 39 SgSSSS GcIccATGGT tIagC GGCGGCGAT GTTGCTGGTG AGGACGTCAA CGGATGGCCA

SEQ 41 ^^ll^Jl lltl^ZllZt GACT GCCGAGTAAA CGCGCCGGCA AGGAGGCGGG AGGATGGCCG

SEQ 43 rCGCGGCRGC GTCCAGCAGC ACCCC ATTAGCGC CAGCGACGTG CAGCTTAAGC AGGAGATG

SEQ 82 Scc?gS?t GCTCC?JggT TAGAC — GCCGGACTT GCCGCTGAAA AGGCCGCTGG TGGATGGCCC

SEQ 84 S^GGcSS SSgSSc OTTC TACC GCGGAGAAAA GAAGCAAAAE TTTGTGACGC AGGAGGAAGG TGGCTGGCCG



1001



1011 1021 1031 1041 1051 1061 1071 



********** ********** ********

SEQ 1 GACCCGCGTC AAAGGGCCCG GCGATATC - CCCTTTGCG GAGCCCTTCG CCAAGCCCAA GGCCATGACG

SEQ 2 G^-CGCG?C ^GGCCCG GCGATATC- CCCTTTGCG GAGCCCTTCG CCAAGCCCAA GGCCATGACG

SEQ 4 GGC-CGCG?C ^GGCCcS CAAATGTG -CCCTTCACC GTTAAGAACC CTGTGCCGAA GGAGATGACC

SEQ 5 GGC-CGCG?C J^GGCCCGA CAAATGTG CCCTTCACC GTTAAGAACC CTGTGCCGAA GGAGATGACC

llo 1 GCG-Ga5g?g ^GGTCcS SgScGGG -GAGQAGC ATATCTTTA6 TCCCGAGGAG GATGCGTATT GGGTGCCGCG GGCGCTGAGC 

llo I GA?-cSgTG ATCGGCCCGT CCACCGTG - - CCCTTCCAC GAGACTTTCC CCACCCCCAA GGCCATGACG 

110 11 GAC-^GCA g??gc5cc?t c5gCA?TG- GCATTC AGACCAAAT GGTAATTTAC CTGTTCCTAA TGAGTTGACC 

lln . \ rll GTGGGGCCAT CTACTGAG- - CCATTTAGT GATTCACACA ATACACCACG AGAATTGACT 

II G^-Sccfc fcGGCCCcS GCGCCA^C- -AG CTA.AACGAG GAGACCTTCC C^CCCCAA GGAGATGACC 

111 II lltmi^l S?G? ^G=:: :::::::::: :::::::::: "-c™ S^-^ 

lit In gIt-gIcg5c TGGGCCCCGA GCGCCATC CCGTTCTCG GAGGACTTTC CGAACCCCAA GGAGATGACC 

llo 21 GA?-GACG?C ^GGGCCCCGA GCGCCATC CCGTTCTCG GAGGACTTTC CGAACCCCAA GGAGATGACC 

llo II ^I-IatUt TATGGACcIa ATGAAGAC AGGTGGGAC GAAAACCACG CTCAACCTCA TAAGTTAACT 

lln II ^C-GACgS GTGGCTCCCA GCGCGATT CCTTACACC AAGGACTGGG CCACACCGCG TGAGTT6ACT 

«n II TAC-GACG?? SggScCCA GCGCGATT -~ CCTTACACC AAGGACTGGG CCACACCGCG TGAGTTGACT 

llo It SclS5Sc G??GC?CCCT CGGcSJc- GC ACAAGAAAAT GGTGTGAACC CAGTTCCCAA GGCTTTCACG 

sll 11 ^-^IaTo g??GC?CCC? CGGcS?C GC ACAAGAAAAT GGTGTGAACC CAGTTCCCAA GGCTTTCACG 

SEQ 32 ~ 21 II IIIIZIIIII 

SEQ 36 ^-AACATT GTTGCTCCTT CTGCCATc" "IIIIIIH IIIIIIIIII II GC ACAAGAAGCT GGCGTGAACC CTGTTCCCAA GGCCTTCACC 

llo 3? S^-A^ATT GTTGCTCCTT CTGCCATC GC ACAAGAAGCT GGCGTGAACC CTGTTCCCAA GGCCTTCACC 

llo 39 ^-^G?C ?GGGCGCcS GTGCGATT CCATGGAAC GAGAAGCACG CTGTCCCAAA GGAGATGTCG 

II S^-GATGTT GTGGGTCCGT CGGGTGGGGA GGACTTTACG TGGGATGAGA G_GTCCTCGAG CGACCCTAGT GGAGGCTACT AT^AG f-f^O.^^G 

twn il rlil^ccii OTCGGACCTA GCAACGAG- CCTTTTGCT CCTGGCTACC CTACCCCCCG TGCTATTACT 

IIS I4 S?-S5g?c ScStcSt cS^ATC- - - GCATATGCG CAAGGTCACG TTACCCCTCG AGCTCTCACG 

1101 1111 1121 1131 1141 1151 1161 1171 1181 1191

SEQ 1 CTGGATGaIg ATCGAGCAGT TCAAGAAGGA CTGGGTGGCG GCCACGAAGC GCGCCATCGC CG CCGGT GCGGACTTTG TCGAGATTCA CAATGCGCAT

SEQ 2 C^GGA^gLg A^CGAgSgT ?S^AAGGA CTGGGTGGCG GCCACGAAGC GCGCCATCGC CG— CCGGT GCGGACTTTG TCGAGATTCA CAATGCGCAT

SEQ 4 AAGCAGGA-T ATCGAGGATC TGAAGACCGC CTGGGTGGCG GCTGTCAAAC GGGCTGTTAA GG CCGGA GCCGACTTTA TCGACATCCA CAATGCGCAT 

SEQ 5 ^GCAGGA-T ATCGAGGATC TGAAGACCGC CTGGGTGGCG GCTGTCAAAC GGGCTGTTAA GG— CCGGA GCCGACTTTA TCGAGATCCA CAATGCGCAT 

SEQ 7 ^GGCCGA-G GTCCGTCAGG TGGTGGCGGC GTTTGCGAAG AGCGCGCGGC TAGCGGTGCA GG— CTGGG GTGGATGTTA TCGAGATCCA TGGGGCGCAT

SEQ 9 ^GGACGA-C ATCGAGCAGT TCAAGCGCGA CTGGTTTGAT GCGTGCAAGC GGGCCATTGC CG— CTGGC GCGGACTTCA TCGAGATCCA CAATGCCCAC

SEQ 11 ^GMGA-A A?CAAACGTG TTGTTAAGGA TTTTGGTGCT GCTGCTAGAA GAGCTGTTGA AATCAGTGGC TTTGATGCAG TTGAGATTCA TGGTGCTCAT

SEQ 13 ^Stga-a ataaattcaa ttgtggaaga ctttgccaat gcagcttggc gggctgtgga aatctcaaaa ttcgatgcca ttgaaataca ttgtgctaat

SEQ 15 G?C^CA-G A^^CGAGC ^CGtCGAGGC CTGGAAGGCG TCTGCCCAGC GTGCCCTCAA GG— CCGGC TTCGACCTCA TTGAGATCCA CGCCGCCCAC

SEQ 17 GTTGCGGA-G ATCAAGGATA TCGTGCAAAA GTTTGCGGTG ACGGCGAGGA TCACGGCCGA GG CCGGG TTCAATGGCG TGGAGATCCA TGCGGCGCAT 

SEQ 18 gSgCGGA-G A?SaGGATA TCGTGCAAAA GTTTGCGGTG ACGGCGAGGA TCACGGCCGA GG— CCGGG TTCAATGGCG TGGAGATCCA TGCGGCGCAT

SEQ 20 GTTgKgGA-G ATTGAGGGAC TCGTCACCAG CTTTGTGGAC GCTGCCAAGC GTGCCATCGA GG CCGGC GTCGACATTA TTGAGATTCA CGGCGCTCAC 

SEQ 21 G??S^GA-G aSgJ^GGAC TCGTCACCAG CTTTGTGGAC GCTGCCAAGC GTGCCATCGA GG— CCGGC GTCGACATTA TTGAGATTCA CGGCGCTCAC

SEQ 23 GAAAAfiS-A TATGMGAAT TAGTGGATAA GTTTGTTGTT GCTGCGAAGC GTGCAGTTGA AA— TAGGT TTTGATGTAA TTGAAATTCA TGGCGCTCAT 

SEQ 25 A^SbS ?Sg?CT GGGTGAAGAA GTTCGCCGAG TCGGCCAAGA GGTCAAATCG A-^-GCTGGT TTTGACGTCA TTGAGATCCA CGCCGCTCA-

SEQ 26 ACCGAGGR-G TCGAGGGTCT GGGTGAAGAA GTTCGCCGAG TCGGCCAAGA GGTCAAATCG AG CTGGT TTTGACGTCA TTGAGATCCA CGCCGCT--- 

SEQ 28 AAGGAGGA-T ATAgKgCAAC TCAAGAGCGA CTACGTGGAA GCGGCAAAAC GAGCCATCCA TG CTGGT TTCGATGTTA TCGAAATTCA TGCAGCTCAT

SEQ 29 AAGGAGGA-T ATAGAGCAAC TCAAGAGCGA CTACGTGGAA GCGGCAAAAC GAGCCATCCA TG— CTGGT. TTCGATGTTA TfGAAATTCA TGCAGCTCAT

SEQ 32 ' IIIIIII IIIIIIIIII IIIIIIIIII IIIIIIIIII 

SEQ 36 I^GGAGcili I^CGAGgIIc ^^IIg^TCA OTTCTGGCT GCMoiwilc aWSCCAWCCG CGC TGGT TTTGATGTCA TCGAGATCCA TGCAGCTCAT

SEQ 37 AAGGAGGA-T ATCGAGGAAC TCAAGAATGA CTTTCTGGCT GCAGCMAAAC GAGCCA«CCG CGC— TGGT TTTGATGTCA TCGAGATCCA TGCAGCTCAT 

SEQ 39 TTGGATGA-T ATCGAGGCTT TCAAGAAGGC GTTTGGAGAG GCGGTCAAGC G6GCATTGAA GGC TGGA TTTGATGTTA TTGAGATTCA CAATGCTCAC

SEQ 41 GTCAGAGA-G ATCAAGGAGA TGGTCCAAGA CTGGGCGACA GCAGCGAAAA GGGCGGTGAA AGC GGGC GTGGATGTAA TCGAAATCCA CGGCGCGCAT 

SEQ 43 AAGGAGGA-T ATTAAGGCGG TGATTGAGGG TTTTGCCCAC ACGGCCGAGT ACCTTGAAAA GGC CGGT TTCGACGGTA TCGAATTGCA CGCCGCCCAC 

SEQ 82 ^TGAAGA-G ATTGAACAGT TGAAGGAGGA CTTTGTTTCC GGTGTTCGTC GAGCGGTTGA AG— CAGGA TTTGACACTA TCGACTTCCA TTTCGCTCAC

SEQ 84 ACCGAGGA-C ATCAACAAGT TGCAAGACAA ATTCGTTCAG TCGGCACGAT GGGCGTTTGA AG CTGGG TATGACTACG TCGAACTTCA CAGCGCTCAC 
SEQ 1
SEQ 2
SEQ 4
SEQ 5

SEQ 7
SEQ 9
SEQ 11
SEQ 13
SEQ 15
SEQ 17
SEQ 18
SEQ 20
SEQ 21
SEQ 23
SEQ 25
SEQ 26
SEQ 28
SEQ 29
SEQ 32
SEQ 34
SEQ 36
SEQ 37
SEQ 39
SEQ 41
SEQ 43
SEQ 82
SEQ 84



G6A.TACCTGC TGTCGTCATT CCTCTCGCCG GCCGCCAAC- — " ' — 

GGATACCTGC TGTCGTCATT CCTCTCGCCG GCCGCCAAC • 

GGCTATCTTC TGATGTCGTT CCTCTCCCCT GCGGTCAAC- ■ — — ~ — " 

GGCTATCTTC TGATGTCGTT CCTCTCCCCT GCGGTCAAC- 

GGCTATCTCA TCAACGAGTT CCTGAGCCCG GTCACGAAT • - 

GGGTATCTTC TCTCGTCTTT CCTATCACCG TCTTCCAAC- — 

GGTTATTTGA TTAATGAGTT CTATAGTCCT ATTTCAAAC- ' ' ' 

GGATGTTTAA TACACCAATT TTTAAGTAAA TTGACAAAC- — " — " 

GGCTACCTCA TTTCCGAGTT CTTGAGCCCC ATCTCCAAC- > 

GGATACCTGT TGGCGCAGTT CTTGAGCAAG AAGACAAAC- ■ " ' 

GGATACCTGT TGGCGCAGTT CTTGAGCAAG AAGACAAAC- "* ~ — — — — — 

ggttacctga tcaccgagtt cctttcgccg ctatcaaacg taagtggaga tactttgtgt ggggctgtgc gcatactccc tcgggtgtga cttctattaa 

ggttacctga tcaccgagtt cctttcgccg ctatcaaac- ' 

ggttatctta tatcgtcaac agttagtcct gccactaat- — " 



GGATATCTAC TGCATCAATT CTTGAGTCCG GTAAGCAAT- 
GGATATCTAC TGCATCAATT CTTGAGTCCG GTAAGCAAT- 



ggatacktgc ttcaccagtt cttgagtcca gtcagtaac- 

GGATACKTGC TTCACCAGTT CTTGAGTCCA GTCAGTAAC- 
GGATACCTCC TCCACGAATT CATCTGCCTG AGAGCAACA- 

gggtacctca tccacgaatt cctctcaccc attaccaac- 
ggttacctgc tggcccaatt cctgtccgaa acaaccaac- 

GGTTftlCTTG TTTCCAGCTT CCTGTCCCCT GCCACCAAC- 
GGATACCTGA TGCACTCGTT CCTCAGCCCG TTGACCAAT- 



1301



1311 



1321 



1331 



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82
SEQ 84 



, AACCGCAC GGACCAGTAC GGCGGGTCGT TCGAGAACCG CATCCGGCTG TCTCTCGAGA TTGCGCAGTT GACTCGGGAC 

AACCGCAC GGACCAGTAC GGCGGGTCGT TCGAGAACCG CATCCGGCTG TCTCTCGAGA TTGCGCAGTT GACTCGGGAC 

ACGAGAAC AGACGAGTAC GGAGGCAGTT TTGAGAATCG CATCCGGCTG AGTCTGGAGA TCGCCAAGCT CACCCGCGAA 

« —ACGAGAAC AGACGAGTAC GGAGGCAGTT TTGAGAATCG CATCCGGCTG AGTCTGGAGA TCGCCAAGCT CACCCGCGAA 

AAGCGGAC GGATGCGTAC GGCGGGAGCT TTGAGAACCG GACCCGGATC GTGCGCGAGG TTGCGGCGGC TATTCGTGCG 

ACGCGCAC CGACGAGTAC GGCGGCTCCT TTGAGAACCG CATCCGGCTG TCTCTCGAAA TCGCCCAGGT CACCCGTGAC 

— AAGAGAAC AGATGAATAC GGTGGCAGTT TTGAAAATAG AACCAGATTT TTAAAGGAAG TTATCGATAG TGTTAAATCA 

AAGAGAGC TGACCAATAC GGGGGCTCAT TTGAAAACAG AGTTAGATTT CTTTTACAAA TAATTGAGAA TATAAAACGA 

CAGCGTAC GGACCAGTAC GGTGGCTCCT TCGAGAACCG CACCCGCGTT CTCCGCGAGA TCATCTCGGC CGTCCGCTCC 

, AGGCGCGG GGATGAGTAT GGCGGGTCGG CTGAGAACAG GGCGAgGATT GTTGGGGAGA TTATTAAGGA GTGCAGGAGG 

AGGCGCGG GGATGAGTAT GGCGGGTCGG CTGAGAACAG GGCGAGGATT GTTGGGGAGA TTATTAAGGA GTGCAGGAGG 

CATTTTATTT CCTGGCACGC AGAAACGGAC AGACAAGTAC GGCGGCAGCT TTGAGAACCG CACCCGGGTC CTGATCGATA TTATCAAGGC CGTCCGGGCA 

AAACGGAC AGACAAGTAC GGCGGCAGCT TTGAGAACCG CACCCGGGTC CTGATCGATA TTATCAAGGC CGTCCGGGCA 

GACCGCAA TGACAAGTAT GGTGGGACAT TTGAGAAACG TATTTTGTTT CCTATGGAAG TTGTCCATTC TGTTCGTAAA 







CGACGAGTAT 
CGACGAGTAT 
CGACGAGTAT 










GGTGGCAGTT TCGAGAACCG TATCAGAGTT GTCTTGGAAA TCCTTGACCT CATCCGCGCT 



-CAAAGAAC GGATGAGTAT GGTGGCAGCT TCGAGAACCG TATCAGAGTC GTCTTGGAGA TCATTG 

-CAAAGAAC GGATGAGTAT GGTGGCAGCT TCGAGAACCG TATCAGAGTC GTCTTGGAGA TCATTG ■ 

-CCAGGACC GACAAGTACG GGCGGAAGCT GGGAAAACCG CACTCGTCTG ACAATGGAAA GTCGTCGACC TTGTCCGCAG 
-CGCCGGAC AGATTCTTAC GGCGGTTCTT TCGAAAACCG TACCCGTCTA CTCATTGAAA TCGTAACAGC CGTCCGAGCC 
-CAGCGCAC CGACGAGTAC GGCGGCAGCC TCGAAAACCG CATGCGGCTA ATCCTCGAGG TCACGGCCGA GGTCCGCAGG 
-AAGCGTAC CGACAAGTAC GGAGGTAGCT TCGAGAACAG AfiTGCGCCTT GCTCTCGAGA TTGTCGAGGC TGCACGAGCT 
-CAGCGTAC CGACGAGTAC GGCGGTAGCC TG6AGAACCG CGCTCGATTT CTGCTCAACG TTGCCCGTCG AATCCGCCAA 



1441 



1451 



1481 



1491 



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 26
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82 
SEQ 84 



GCCGTCGGCC CTCATGTGCC C — GTTTT CCTGCGCATT TCGGCCTCGG ACTGGTGCGA GGAGACCCTG CCGGA 

GCCGTCGGCC CTCATGTGCC C GTTTT CCTGCGCATT TCGGCCTCGG ACTGGTGCGA GGAGACCCTG CCGGA 

AATGTGCCCA AGGATATGCC T GTCTT CCTGCGGGTC TCCGCCACCG ATTGGCTGGA GGAGGTGCAG CCGAA 

AATGTGCCCA AGGATATGCC T GTCTT CCTGCGGGTC TCCGCCACCG ATTGGCTGGA GGAGGTGCAG CCGAA 

GTGATTCCCG A6GGGATGCC C CTGTT TCTGCGTATC AGCGCCACGG AGT6GTTGGA GGGTCAGCCG GTGGC 

GCCGTCGGCC CCAACGTTCC T GTTTT TCTCCGTGTC TCCGCGACGG ACTGGATCGA GGAGACCCTG CCCGA 

AGTATTCCAA ACGATGTTCC A GTGTT TTTGAGAATC TCTGCTGCTG AAAATAGTCC TGATCCA 

AAGATAGAAA CA CC G — ATTTT CTTAAAGTTT CCAATGTCAG ATAATTGTAG TGATCCG- 

GTCATCCCCG AGGACATGCC C CTCTT CGTCCGTGTC TCCGCCACCG AGTGGATGGA GTACACC— 

CAGGTGACTG AGGCGGTGGG TGAAGAGGAG GCGAAGAAGT TTGTGGTGGG AATCAAGCTG AACAGTGCGG ATTGGCAGGC GGGACGCGAT GGA A 

CAGGTGACTG AGGCGGTGGG TGAAGAGGAG GCGAAGAAGT TTGTGGTGGG AATCAAGCTG AACAGTGCGG ATTGGCAGGC GGGACGCGAT GGAAAG 

GTGATTCCCG AGGAGATGCC A CTCTT CGTCCGAATC TCCGCCACCG AATGGATGGA GTACGCCGGC 

GTGATTCCCG AGGAGATGCC A CTCTT CGTCCGAATC TCCGCCACCG AATGGATGGA GTACGCCGGC 

GCAATTCCAG ATAGTATGCC C — TTGTT TTATAGAGTA ACGGCTACAG ATTGGTTGCC CAAAGGACAA 



GCCATCCCCG AAACTACACC T- 



-GTCCT CGTTCGTGTC AGTGCAACTG ATTGGTTCGA GTTTGACTCT CAATTCAAAG 



CAXT 

GCGATGCCCT CCAGCATGCC T- 
CGGACGAGCA AGAATTTCAT C- 
GTTATGCCTG AGGACATGCC C- 
GAATTCCCCA ACAAGGGT 



-CTCTT CCTCCGCCTC TCCTCTACAG AATGGATGGA AGATACCGAC ATCGGC 

-CTCGG CATCAAAATT AACAGCGTCG AGTTCCAGGA GAAG 

-TTGTT CACTCGCATC AGTGGAACTG ACTGGCTGGA GAACAACCCT GAG 

-CTCTG GGTGCGCGTC AGCTCCACCG ACTGGGCCGA CCAAGCGCAC CAA 
1501 1511 1521 1531 1541 1551 1561 1571 1581 1591



****** ********** *****#»»## 

SEQ 1 GCAGAGCTGG AAGTCGGAiGG ATACCGTGCG GTTCGCGCAG GAGCTGGTCA AGCAGGGCGC CGTTGATCTG ATCGATATCA GCAGCGGTGG 

SEQ 2 GCAGAGCTGG AAGTCGGAGG ATACCGTGCG GTTCGCGCAG GAGCTGGTCA AGCAGGGCGC CGTTGATCTG ATCGATATCA GCAGCGGTGG 

SEQ 4 CAA GCCCAGCTGG CGAGGCGTGG ACACTGTCCG ATTTGCGAAG ATCCTGGCAG AAACGGGTTA CGTTGACGTG CTTGACGTGA GCAGTGGCGG 

SEQ 5 CAA GCCCAGCTGG CGAGGCGTGG ACACTGTCCG ATTTGCGAAG ATCCTGGCAG AAACGGGTTA CGTTGACGTG CTTGACGTGA GCAGTGGCGG 

SEQ 7 -CGCGGAGTC GGGCAGCTGG GATAT GC AGAGCTCGCT GGAGCTGGTC AAGAAGCTGC CCGAATGGGG CATTGACCTG GTGGATGTCA GCTCCGCCGC 

SEQ 9 GGAA.TCGTGG AAGCTCTCTG ACTCCGTCCG CTTCGCCGAA GCCCTCGCTG CCCAGGGCGC TATTGACCTG ATCGACGTCT CTTCCGGCGG 

SEQ 11 -GAAGCTTGG ACTATTGAAG ATTCCAAAA- — AATTAGCT GACATTTTAG TAGAAAAGGG TATTGCTTTG GTTGATGTTT CATCTGGTGG 

SEQ 13 -GAAGCGTGG TCTACGGAAG ATGCATTGA- — AGTTGGCC GATCTTGTTA TTGATTTAGG AGTAAAGGTG ATCGACGTTA CATCAGGTGG 

SEQ 15 GGCCA GCCCTCGTGG GACCTCCAGC AGACCATTG- — AGCTCGCC AAGATCCTCC CCGACCTCGG CGTCGACCTC CTCGACGTCT CTTCCGGCGG 

SEQ 17 AGGAGGAGGA GGAGACGGAT ACGGCGGAGG AGGTGTTGA- — AGCAGATT GAGCTTTTTG AGCAGTGGGG GATCGACTTT GTCGAGGTTA GCGGTGGCAG 

SEQ 18 — GAGGAGGA GGAGACGGAT ACGGCGGAGG AGGTGTTGA- — AGCAGATT GAGCTTTTTG AGCAGTGGGG GATCGACTTT GTCGAGGTTA GCGGTGGCAG

SEQ 20 GA GCCTAGCTGG GACCTCGAGC AGAGCACAC- — AGCTTGCC AAGCTCCTCC CGGACCTGGG TGTCGACCTG CTCGACGTCA GCTCGGGCGG 

SEQ 21 GA GCCTAGCTGG GACCTCGAGC AGAGCACAC- — AGCTTGCC AAGCTCCTCC CGGACCTGGG TGTCGACCTG CTCGACGTCA GCTCGGGCGG 

SEQ 23 GGATGG GAGATAGAAG ATACAGTTG- — CATTAGCA GCGAGGCTTC GCGATGGTGG TGTTGACTTG ATAGATGTTA GCTCTGGTGG 

SEQ 25 — — ■ ' 

SEQ 26 

SEQ 28 

SEQ 29 

SEQ 32 ACGAGTTTCC TGAAAGCTGG ACAGTCGAGC AGACTT G TCAACTCGCG CGTATCTTGC CCAAGCATGG AGTAGACTTG GTGGACGTCA GCTCAGGCGG 

SEQ 34 ■ 

SEQ 36 ■ 

SEQ 37 

SEQ 39 

SEQ 41 —AAGAAGTT CGGAAGCTGG GATGTCGAAA GCACGATCA AGATCTCC AAAATCCTGG CCGACTTGGG CGTTGATCTC CTCGACGTGT CTTCCGGTGG 

SEQ 43 GGTTTCAAG CCA GAGG AGGCGGTGC- — AGTTGTGC GAGGCCCTCG AGGCCGC6G6 CATGGATTTT GTCGAGACGA GCGGCGGCAC 

SEQ 82 — TACGAGGG AGAGACCTGG ACTCTTGAGC AGAGCATCA- — AGCTTGCA CACCAGTTAG CAGACCGTGG TGTCGATGTT TTGGATGTTT CCAGTGGTGG 

SEQ 84 GC CGACTCTTGG ACCGTTGACC AGACGGTTG AACTCGCC AAGATGCTCC AAGAGGCTCG AGTCGACCTG CTAGACGTCA GCTCGGGCGG 

1601 1611 1621 1631 1641 1651 1661 1671 1681 1691 



Mnmitiiiiifi\) MUM 

SEQ 1 TGTTCTCGCG CAG — 

SEQ 2 TGTTCTCGCG CAG — 

SEQ 4 CACTCATTCG GAG — 

SEQ 5 CACTCATTCG GAG ' 

SEQ 7 GAACCACAAG GAC ' 

SEQ 9 TGTCCACGCC GCG 

SEQ 11 TAACGATTAT AGA 

SEQ 13 AAATGTTGCG CAT 

SEQ 15 CAACAACAAG GAC ' 

SEQ 17 TTATGAGGAT CCTCAGGTAA GTTTTGGTGT TGTTTGAGGG ATGGGGCAAG GGGTTGTCTG TCGTGAACAA CAAAAGGGGC ACGGAACAAA TGCTAACGCC 

SEQ 18 TTATGAGGAT CCTCAG ' ■ 

SEQ 20 AAACTCGGTG GCC ' 

SEQ 21 AAACTCGGTG GCC " 

SEQ 23 TAATOWyVAG GAT ■ ' ' 

SEQ 25 ' ' 

SEQ 26 ' ■ 

SEQ 28 ■ ■ 

SEQ 29 

SEQ 32 TATCCATCCT AAG ' 

SEQ 34 ■ ■ 

SEQ 36

SEQ 37 ■ 

SEQ 39 

SEQ 41 GAATCATCCT CAG ' ' 

SEQ 43 CTATGAGAGT TTT- ■ ' ' — ' 

SEQ 82 CATCCACAAG ATG 

SEQ 84 CCTGGTTCCA TTC — 

1701 1711 1721 1731 1741 1751 1761 1771 1781 1791 



mmmaa ummmm n^mii^ai^^ m 

SEQ 1 CAG AAGATCAAGT CCGGCCCTGC CTTCCAGGTG CCTTTTGCCG TGGCCGTGAA GAAGGCCGTC GGCGAC 

SEQ 2 CAG AAGATCAAGT CCGGCCCTGC CTTCCAGGTG CCTTTTGCCG TGGCCGTGAA GAAGGCCGTC GGCGAC 

SEQ 4 CAG CATATCCACG CGAAGCCAGG CTTCCAGGCA CCCTTTGCTA TTGCCGTCAA GAAGGCCGTC GGGGAC 

SEQ 5 CAG CATATCCACG CGAAGCCAGG CTTCCAGGCA CCCTTTGCTA TTGCCGTCAA GAAGGCCGTC GGGGAC 

SEQ 7 —CAG AAGATOiACC TGCACACGGC CTACCAGACG GACCTGGCC6 GGCAGATTCG CCAGGCCATC CGAGCG 

SEQ 9 CAG AAGATCAAGT CCGGGCCGGC TTTCCAGGCT CCCTTCGCTG TGGCTATCAA GAAGGCCGTT GGCGAT 

SEQ 11 C AACCACCAAG ATCTGGGATC AGTAAAGAGT TGAGAGAGCC AATCCATGTT CCGTTGTCTC GTGCAATTAA ACAACATGTT GGTGAC 

SEQ 13 — T GCAAATCTAG ATATCTATTA AATGACGACA AACAACTACC TTCTCAAGTG CCCTTGGCTC GTAAATTGAA AAGCCACATT AGAAAC 

SEQ 15 CAG AAGATCAACG TCCACACCTA CTACCAGATC GACATGGCCG AGCAGATCCG CGCGGCCGTG CACGAGGCCG 

SEQ 17 ATACAGATGG CCAACGGTCC CAAGCCCGAA AAGTCCGAAC GCACCATGGC CCGCGAGGCC TTCTTCCTCG AGTTCGCCAA GATCATCCGC ACCAAfi T 

SEQ 18 ATGG CCAACGGTCC CAAGCCCGAA AAGTCCGAAC GCACCATGGC CCGCGAGGCC TTCTTCCTCG AGTTCGCCAA GATCATCCGC ACCAAG T 

SEQ 20 CAA AAGATCGAGC TCACGCCGTA CTACCAGATC GACCTGGCAG CCAAGATCCG CGAGGCCGTC GGCGAT 

SEQ 21 — CAA AAGATCGAGC TCACGCCGTA CTACCAGATC GACCTGGCAG CCAAGATCCG CGAGGCCGTC GGCGAT 

SEQ 23 CAA AGAATTGAGG TGAAGGATTG CTATCAAGTT CCTTTTGCCG AAAAGATTAA GGATCAAGTG AATGGA 

SEQ 25 

SEQ 26 

SEQ 28 — > - 

SEQ 29 

SEQ 32 -TCCGCCATC GCCATCAAGT CCGGTCCTGC TTACCAGGTA GACCTCGCCA AACAGGTAAA GAAGGCTGTT GGCGAT 

SEQ 34

SEQ 36 

SEQ 37 ■ — • 

SEQ 39 

SEQ 41 CAG AAAATCAACA TGTTCAACAC C_ 

SEQ 43 — G GTTTTGCGCA CCGCAAGGAG TCCAGCCGCA AGCGGGAGAA CTATTTTATC GAGTTCGCCG AGGTCATCCG GAAGGCCGTC AAGCAC 

SEQ 82 ■ CAA AAGGTCGCTG CTGGTCCCGG TTACCAGGCA CCTCTTGCCA AGGCGATCAA GAAGTCAGTT GGAGAC 

SEQ 84 CAA AAAATCACCG TGG6AGCCGG ATACCAGCTA TTCGGAGCAA AAfiCCGTTCG CGATGCTCTG GCCAAA 
SEQ 1
SEQ 2 
SEQ 4 
SEQ 5 

SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36 
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82 
SEQ 84 



AAGCT GCTGGTTGCC GCCGTGGGTG CCATCACC — — — • AACG GCAAGCAGGC 

AAGCT GCTGGTTGCC GCCGTGGGTG CCATCACC — — AACG GCAAGCAGGC 

AAACT CGCAGTGGCA TCAGTGGGTA TGATTGCC — — AGCG CGCATTTGGC 

AAACT CGCAGTGGCA TCAGTGGGTA TGATTGCC — — — ' — ' -AGCG CGCATTTGGC 

GCTGG CGCGTCGACT CTTGTGGGTG CTGTAGGTCT GATCACCGAT TCGGAACAGG CGAGGGGACT AGTTCAGGGA GCGGACGAGG CGACTGCAGC 

AAGCT CCTXGTTGCG ACGGTGGGCA CGATCACG — AACG GTAAGCAGGC 

— — AAGTT ATTGGTCAGT TGCGTTGGTG GGCTTGAA — A AAGATCCTGA 

CGATG TTTGATCGCA TGCAGTGGAG GATTAGAT — — — — C GAGACATATT 

GCAAGCAGCT CCTCGTCGGT GCCGTCGGCT TGGTCACC — - — ■ TCG GCTGAGATCG CCAAGGAGAC CGTCCAGGAG AAGGAGGATG GCAGAGTCAC 

TCCCCAAGCT TCCTCTCATG GTCACCGGCG GCTTCCGC ■ ACTC GTCAGGGCAT 

TCCCCAAGCT TCCTCTCATG GTCACCGGCG GCTTCCGC— — ACTC GTCAGGGCAT 

AGGTT GCTCATAGGC GCGGTCGGCA ACATCAAC — ■ — ■ ACGG CTGACATTGC 

AGGTT GCTCATAGGC GCGGTCGGCA ACATCAAC— ■ ACGG CTGACATTGC 

AT ACTACTTGGC GCTGTCGGAA TGATCAGG — — — ■ GATG GTCTTACGGC 



-AGTGT ACTTGTTTCA GCAGTAGGTG GAATCAAG- 



~A CTGGACATCT 



^ATGGT GGTCTACACC ACCGGCGGCT TCAAGACG — 

AAGAT GTTGATCAGC ACTGTTGGTA GCATCAAG — 

— ATCGAACC CGACGCGTCC AAACGCATGC TCGTCGGGG- 



-GTGGGCG CCATGGTCGA 

ATAG GTACCCTTGC 

— CCGTGG GAATGATGGA 



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29
SEQ 32 
SEQ 34 
SEQ 36
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82 
SEQ 84



GAATCAG 

GAATCAG 

CAATTCC 

CAATTCC 

CGAGGCAATG 

GAACAAG 

ATTGCTCAAC 
TAAACTCGAT 
CATCCAGCGC 

GGAGGCC 

GGAGGCC 

GCGCGATGTC 
GCGCGATGTC 
6AATGAAATC 



ATTCTAG 

ATTCTAG 

TTGTTGG 

TTGTTGG 

CTGTCGGGAC 

CTGCTTG 

AAATATTTAG 
GAGTTTATTG 
GAGAACGGCG 

GCTTTGG 

GCTTTGG 

GTGGATGAGC 
GTGGATGAGC 
CTAOAAAGTG 



AGGAGCAG ■ ■ — ; 

AGGAGCAG— ' ■ ■ 

AGAAGGAC — 

AGAAGGAC • 

CTGAACCC — 

AGGAGGAG ■ ■ 

AAGAAGGA — ' 

CTAATGGT — 

CCAAGACT— 

AATCCGAT — • 

AATCCGAT~*~ . , . — — — ■ w.^^ — — — — ■ • 

AGGGCGCCGA GAAGGTGGCC GAGGCCAAGC AGACGCATGA CACCATCGAG GTCGTGAGCG AATCACATGG CGGCAAGACC 
AGGGCGCCGA GAAGGTGGCC GAGGCCAAGC AGACGCATGA CACCATCGAG GTCGTGAGCG AATCACATGG CGGCAAGACC 
GAAAAGCT 



TGCTGAA GA6GTTT TGCAATCT- 



CGCGCTGCAG GGCGTCGATG GG 

GGAGGA6— ATCATCG CTGGAGGAGA GGACGATACC 

AGGTTCC — -TACGATT CGCCCAAC ' 



2061 



SEQ 1 
SEQ 2 
SEQ 4 
SEQ 5 
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23
SEQ 25
SEQ 26
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82
SEQ 84 



GATATCGACG 
GATATCGACG 
GGACTGGACC 
GGACTGGACC 
AAGGCGGATG 
GGATTG6AT6 
ACATTTGATC 
GACTTTGATA 
CGTGCCGATA 
GATTGCGACA 
GATTGCGACA 
AAGGCGGATC 
AAGGCGGATC 
GATG 



TTGCGCTGGT 
TTGCGCTGGT 
TTGTGCTGGT 
TTGTGCTGGT 
CCATTCTGAT 
TTGCGCTTGT 
TTGCTTTGAT 
TAGCATTGAT 
TGGTCCTTGT 
TGATCGGTAT 
TGATCGGTAT 
TGGTCCTCAT 
TGGTCCTCAT 
TTACTTTTGT 



TGGCCGTGGG 
TGGCCGTGGG 
TGGACGTGGC 
TGGACGTGGC 
AGCCCGTCAG 
GGGACGTGGT 
CGGTAGAGGA 
AGGTAAAGGA 
TGCCAGGCAG 
CGGACGCCCG 
CGGACGCCCG 
TGCTCGCCAG 
TGCTCGCCAG 
CGCAAGGGAG 



TTCCAGAAGG 
TTCCAGAAGG 
TTCCAGAAGA 
TTCCAGAAGA 
TTCCTGCGCG 
TTCCAGAAGG 
TTTTTAAGAA 
TTTCTCAAAA 
TTCTTGAAGG 
GCCATCATCA 
GCCATCATCA 
TTCCTGCGCG 
TTCCTGCGCG 
TTCTTAAGGA 



ATCCCGGTCT 
ATCCCGGTCT 
ACCCGGGGCT 
ACCCGGGGCT 
AGCCAGAATG 
ATCCCGGTCT 
ATCCAGGTTT 
ACACTG6ATT 
AGCCCGAGTT 
ACCCTTCGCT 
ACCCTTCGCT 
AGCCTGAGTT 
AGCCTGAGTT 
ACCCGTCGTT 



GGCCTGGACG 
GGCCTGGACG 
GGTGTGGGCG 
GGTGTGGGCG 
GGTGTTTTCC 
GGCGTGGACT 
GGTATGGGAG 
GATCAGCCGT 
CGTCCTCACT 
TCCCGCCAAC 
TCCCGCCAAC 
TGTGCTGAGG 
TGTGCTGAGG 
GGTGCTAGAC 



TTTGCTCAGC 
TTTGCTCAGC 
TGGGCCGACG 
TGGGCCGACG 
ACGGCGAGAA 
TTCGCGCAGC 
TTTGCCGATA 
ATTGCTGACC 
GTCGCCGACG 
TTGATCCTCA 
TTGATCCTCA 
ACGGCGCATA 
ACGGCGCATA 
AGCGCGAACC 



ACCTCGGCGT C- 
ACCTCGGCGT C- 
AGCTGAATGT A- 
AGCTGAATGT A- 
AGTTGGGCGT G- 
ATCTTGATGT T- 
AACTTGGTGT T- 
AATTGCAAGC A- 
AGTTGGGTGT T— 
ACCCGGAGGT G- 
ACCCGGAGGT G- 
ACCTTGGGGT C- 
ACCTTGGGGT C- 
AGTTGGGTGA A- 



GGTATCGACA TTGTGAGGGC TGGACGTTGG TTCCAACAGA ATCCTGGTCT GGTTCGAGCT TTTGCTAACG AGCTTGGCGT G — 



— ATAGGCAT CGGGCGCGCA GCCGGTTCGG AGCCGGACCT CGCCAAGGAC ATCATCGCGG GCAAGGTGTC CAGCATTATC AAATACGCCA 

CCCTTGGATC TTGTGGCTTC AGGCCGTCTG TTCCAGAAGA ACACTGGACT TGTTTGGTCA TGGGCTGACG ATCTGAACAC T 

GGCCAAGACC GCAGCCAGAT TG6CAAGTTG GCCGAGCAGT CGATTCAGAG CGGAGAGTGT GATGCGGTAC TGTTGGCACG T GGATTGA 
SEQ 1 GAAA TCTCCATGGC CAACCAGATC CGCTGGGGCT TCACCCGGCG TGGAGGCACC CCGTACATTG ATCCTTCGGT

SEQ 2 GAAA TCTCCATGGC CAACCAGATC CGCTGGGGCT TCACCCGGCG TGGAGGCACC CCGTACATTG ATCCTTCGGT

SEQ 4 GAGA TCTCCATGGC TAATCAGATC CGATGGGGTT TCTCGCGGCG CGGTGCTGGT CCTTACCTCA GGAAGAAACT 

SEQ 5 GAGA TCTCCATGGC TAATCAGATC CGATGGGGTT TCTCGCGGCG CGGTGCTGGT CCTTACCTCA GGAAGAAACT 

SEQ 7 CCGG TGACTGTCCC GGTGCAGTTT GGCAGGGCCA TTTAG " 

SEQ 9 GAGA TTGCGATGGC 6AGTCAGATT CGGTGGGGAT TCACAAGGCG CGGGGGCACG CCTTATATCG ACCCCAAAGC 

SEQ 11 AGAC TCCACCAGGC CTTGCAGTTA GGTTGGGGTT TCTGGCCCAA CAAACAACAA ATTGTTGATT TGATTGAAAG

SEQ 13 CAAT TCAGAACAGC ACCTCAATAT AAGTTGGCCT TATCATAA — 

SEQ 15 ■ GATG TCAAGGCCCC TGTTCAGTAC CTCCGTGGTC CTCTTAGCAG CAGGCCCAAG AAGTTGACCA CTGTTCCTTA 

SEQ 17 CCGG ATGCGGATGC CCGCTTGTTC GACAAGAAGA GGGCTGAGCC GCACTGGATC GTTGAGAAGT TGGGCATGAA 

SEQ 18 CCGG ATGCGGATGC CCGCTTGTTC GACAAGAAGA GGGCTGAGCC GCACTGGATC GTTGAGAAGT TGGGCATGAA

SEQ 20 AATG TGCAGTGGCC TCACCAATAC CACAGAGCAG TGTGGCGCAA GGGTGCAAGG ATTTGA 

SEQ 21 • ■ AATG TGCAGTGGCC TCACCAATAC CACAGAGCAG TGTGGCGCAA GGGTGCAAGG ATTTGA ' 

SEQ 23 AATG TTGCATGGCC AGTTCAGTAT GACTATGCAG TTAAGGGACA CA6AAAGTTA CGTTGA 

SEQ 25 

SEQ 26 ■ ■ 

SEQ 28 

SEQ 29 

SEQ 32 ' GAGG TCAAGATGGC GAACCAGATT GATTGGAGCT TCAAGGGACG TGGAAAGAAA GTGAACAAGA GTTCTTTATA 

SEQ 34 • •- — " 

SEQ 36 

SEQ 37 

SEQ 39 — ' 

SEQ 41 • ' 

SEQ 43 TGGGGGAGGA CGAGTTTGTG CTGCAGTTGA CTGCCTGCTC GGCGCAAATA AGGCTGATGG CCAAGGGCGA GGAGCCGTTT GAC

SEQ 82 ■ TCTA TCCAGATCGC TCATCAGATC GCATGGGGTT TCGGTGGCAG AGCTAAGAAG AACGCTCCCA AGCTTGTCTT 

SEQ 84 TGTCCTACCC AAGCTGGACC GAGGATGCTA GTGTAGCGCT GATGGGTACC AGGGCAGCTG GCAACCCGCA GTACCATCGC GTTCACGTGG CTAAGAAGTG 

2201 2211 2221 2231 2241 2251 2261 2271 2281 2291 



SEQ 1 GTACAAGCAG TCTATTTTCG ATGTATAfi — ~ " 

SEQ 2 GTACAAGCAG TCTATTTTCG ATGTATAG— 

SEQ 4 CGAGAAGATA TAA 

SEQ 5 CGAGAAGATA TAA 

SEQ 7 ' 

SEQ 9 TTATAAGGAG AGCATCTTTG AGTAA 

SEQ 11 AACATCTAAA TTAGAAGTAA ATTAG 

SEQ 13 

SEQ 15 A 

SEQ 17 GTCCATTGTT GGTGCTGGTG TTGAGGTGGT ACGTCACGTT CCAACCCCAT TTGCTTCATT GTGTTTCCGA GTATGTCATG CTGACTTGGT TCTTTTCTAfi 

SEQ 18 GTCCATTGTT GGTGCTGGTG TTGAGGTG — — ' ' 

SEQ 20 • 

SEQ 21 

SEQ 23 

SEQ 25 ■ ' 

SEQ 26 

SEQ 28 " • • 

SEQ 29 

SEQ 32 G • 

SEQ 34 

SEQ 36 • ■ 

SEQ 37 ~ 

SEQ 39 — — 

SEQ 41 . 

SEQ 43 ' ATCTC AAACGCCGAC GAGGTGGCGC GGGTGACGCA GTTGATGGCG 

SEQ 82 A ■ 

SEQ 84 A • — T 



SEQ 1 AGTATAGATA GAGTTGAAGA TGATACCTCA TAGACGATCA ATGGACCCTT GCATATTATT TCTCGTCTCC TGCGTATGTT CAAGGTATTC ACAGTAGCTG 

SEQ 2 AGTATAGATA GAGTTGAAGA TGATACCTCA TAGACGATCA ATGGACCCTT GCATATTATT T 

SEQ 4 ' 

SEQ 5 

SEQ 7 ' ■ " 

SEQ 9 

SEQ 11 

SEQ 13 

SEQ 15 ■ ' ' 

SEQ 17 ACGTGGTATG TGAGCGAGCT CAAGAAGCTG GCCAAGTTTT AG-' 

SEQ 18 ACGTGGTATG TGAGCGAGCT CAAGAAGCTG GCCAAGTTTT AG ' — ■ — ' 

SEQ 20 ■ 

SEQ 21 

SEQ 23 ' ' 

SEQ 25 ' 

SEQ 26 — — ' — " "* 

SEQ 28 " 

SEQ 29 

SEQ 32 ' • ■ 

SEQ 34 • ' 

SEQ 36

SEQ 37 

SEQ 39 ' 

SEQ 41 ' ' 

SEQ 43 GAGGGCAAGG TG 

SEQ 82 ■ ■ ' — 

SEQ 84 " ' 
SEQ 5
SEQ 7 
SEQ 9 
SEQ 11 
SEQ 13 
SEQ 15 
SEQ 17 
SEQ 18 
SEQ 20 
SEQ 21 
SEQ 23 
SEQ 25 
SEQ 26 
SEQ 28 
SEQ 29 
SEQ 32 
SEQ 34 
SEQ 36
SEQ 37 
SEQ 39 
SEQ 41 
SEQ 43 
SEQ 82
SEQ 84 




2401 2411 2421 2431 2441 2451 2461 2471 2481 2491 

SEQ 1 CGTCCTCTTA AGTTTCTCCG TCATTCGTTC TATTCTACTC CAATCGCAAC GCATGGCGAC CACGGATCGA GTCGAATTTC TCCGTCGTTC GTATCTGATC 

SEQ 2 — ' • 

SEQ 4 ■ — 

SEQ 5 ' 

SEQ 7

SEQ 9 ■ — ■ 

SEQ 11 ■ 

SEQ 13 ■ ' ■ ' ■ — 

SEQ 15 • 

SEQ 17 — 

SEQ 18 ■ ■ ' 

SEQ 20 — ■ ■ — 

SEQ 21 

SEQ 23 

SEQ 25 • 

SEQ 26 ■ ' 

SEQ 28 

SEQ 29 

SEQ 32 ' ' 

SEQ 34 

SEQ 36 ' 

SEQ 37 

SEQ 39 ■ 

SEQ 41 ■ ■ • 

SEQ 43 ' 

SEQ 82 

SEQ 84



2501 2511 2521 2531 2541 2551 2561 2571 2581 2591 

SEQ 1 AATATAAAAA GCGGGGAATG GCTTGACCCC GCGCAGAATG TCGATCTCTT CGCAAACTCT CGGTGTATAG GACGCTCAGC AACGATCAAG G
SEQ 2 



Figure 2. A multiple alignment of the 2031 OR nucleic acid
sequence from A. fumigatus (SEQ 1/2) along with related 2031 ORs
from other fungi and bacteria (see also Example 4). Regions 1-11,
marked with * or #, refer to regions conserved at the amino acid
level between ORs but not OYEs.

Fungal 2031 ORs are given by SEQ ID No.: SEQ ID Nos. 1, 2, 4, 5
and 7, A. fumigatus; SEQ ID No. 9, A. nidulans; SEQ ID Nos. 11 and
13, C. albicans; SEQ ID Nos. 15, 17 and 18, N. crassa; SEQ ID
Nos. 20, 21 and 43, M. grisea; SEQ ID No. 23 (NP_595868), S.
pombe; SEQ ID Nos. 25 and 26, C. trifolii; SEQ ID Nos. 28, 29,
31, 32 and 34, F. sporotrichioides; SEQ ID Nos. 36, 37 and 82, F.
graminearum; SEQ ID Nos. 39 and 41, M. graminicola; SEQ ID No.
84, U. maydis.
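
As an illustration only, the kind of column marking described in the Figure 2 legend (positions conserved among ORs but not shared with OYEs) can be sketched in Python. The function name, the marking rule and the toy peptides below are assumptions made for this sketch; they are not taken from the figure and do not reproduce how Regions 1-11 were actually defined.

```python
def mark_group_specific_columns(or_rows, oye_rows):
    """Return a marker string with '*' at columns that are identical
    (and ungapped) across all OR rows and whose residue is absent
    from every OYE row at that column; all other columns get ' '."""
    length = len(or_rows[0])
    if any(len(row) != length for row in or_rows + oye_rows):
        raise ValueError("all aligned rows must have the same length")
    markers = []
    for col in range(length):
        or_residues = {row[col] for row in or_rows}
        oye_residues = {row[col] for row in oye_rows}
        # Conserved means one residue type in the OR group, no gap.
        conserved = len(or_residues) == 1 and "-" not in or_residues
        specific = conserved and or_residues.isdisjoint(oye_residues)
        markers.append("*" if specific else " ")
    return "".join(markers)


# Hypothetical aligned peptides, invented for this example.
toy_or_rows = ["GWPDK", "GWPEK", "GWP-K"]
toy_oye_rows = ["GHAAK", "GHSAK"]
print(repr(mark_group_specific_columns(toy_or_rows, toy_oye_rows)))
```

A real analysis would start from the aligned amino acid sequences rather than the nucleotide rows shown in the figure.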
[Figure 3 image. Lane labels: 0 h, 0.5 h, 1 h, 2 h, 4 h, 24 h. Panel label: B.]

Figure 3. Recombinant 2031 OR. (A) Time course of recombinant 2031 OR induction over 24
hours after the addition of IPTG (samples without IPTG are also shown). The gel was stained
with Coomassie; a prominent band of the correct molecular weight (marked with an arrow) is
seen. (B) Coomassie-stained gel showing purified recombinant 2031 OR.
[Figure 4 image: phylogenetic tree with bootstrap values at the nodes and a scale bar of 0.1. Tip labels as printed, top to bottom: 4961, A. fumigatus; SEQ ID No. 43, M. grisea; SEQ ID No. 19, N. crassa; SEQ ID No. 14, C. albicans; SEQ ID No. 12, C. albicans; SEQ ID No. 24, S. pombe; SEQ ID Nos. 30 + 33, F. sporotrichioides; SEQ ID No. 6, A. fumigatus; SEQ ID No. 3, A. fumigatus; SEQ ID No. 10, A. nidulans; SEQ ID No. 8, A. fumigatus; SEQ ID No. 16, N. crassa; SEQ ID No. 22, M. grisea; NP_295913; NP_625402; AF320254; T44612; 6-2460, C. albicans; A3699G, C. albicans; NCU04452.1, N. crassa; 4875, A. fumigatus; OYE3, S. cerevisiae; OYE2, S. cerevisiae; OYE1, S. cerevisiae. Group labels: Bacterial; Fungal 2031 ORs.]

Figure 4. Phylogenetic tree showing relationships between A. fumigatus 2031 OR and similar
proteins. This demonstrates a 2031 OR clade, which can be distinguished from the OYE
proteins.
Figure 5: NADPH dehydrogenase activity of recombinant 2031 OR with cyclohexenone (CHX),
N-ethylmaleimide (NEM), menadione (MEN) or duroquinone (DQ) as substrates. Final
concentrations in the assay were as follows: 500 µM substrate, 120 µM NADPH, 1 µg/200 µL
2031 OR.
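
For orientation, the final concentrations in the Figure 5 legend can be converted to per-reaction amounts. This short Python sketch assumes that "1 µg/200 µL" means 1 µg of enzyme in a 200 µL reaction and that substrate and NADPH are in that same 200 µL volume; the constant and function names are illustrative only.

```python
# Values quoted in the Figure 5 legend.
REACTION_VOLUME_UL = 200.0   # assumed total reaction volume (µL)
SUBSTRATE_UM = 500.0         # final substrate concentration (µM)
NADPH_UM = 120.0             # final NADPH concentration (µM)
ENZYME_UG = 1.0              # enzyme mass per reaction (µg)


def nmol_in_reaction(conc_um, volume_ul):
    """Amount in nmol: µmol/L x µL gives pmol, and 1000 pmol = 1 nmol."""
    return conc_um * volume_ul / 1000.0


def ug_per_ml(mass_ug, volume_ul):
    """Mass concentration in µg/mL for a mass dissolved in volume_ul."""
    return mass_ug * 1000.0 / volume_ul


substrate_nmol = nmol_in_reaction(SUBSTRATE_UM, REACTION_VOLUME_UL)
nadph_nmol = nmol_in_reaction(NADPH_UM, REACTION_VOLUME_UL)
enzyme_ug_per_ml = ug_per_ml(ENZYME_UG, REACTION_VOLUME_UL)

print(f"substrate: {substrate_nmol} nmol per reaction")
print(f"NADPH: {nadph_nmol} nmol per reaction")
print(f"enzyme: {enzyme_ug_per_ml} µg/mL")
```

Under these assumptions each reaction holds 100 nmol of substrate, 24 nmol of NADPH and enzyme at 5 µg/mL.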
Figure 6: Inhibition of 2031 OR function by two inhibitors (shown in A and B) identified by
high-throughput screening.



